ABSTRACT

Continuous stream applications such as network monitoring, retail market data analysis, and stock market prediction require “frequent patterns” to be detected recurrently. The literature shows that several pattern mining solutions have been developed over the years. Still, many challenges remain because continuous, unbounded, and ordered data are generated rapidly in real time. Hence, extracting frequent patterns from recent data will improve the analysis of stream data. In this article, a new landmark window model, CIP (candidate indexing and pruning), is considered for mining the datasets. CIP allows mining over the entire history of a data stream, which improves accuracy. This article also proposes the candidate indexed sub (CIS)-tree scheme to extract the essential information from each incoming transaction of a data stream. Our proposal is compared with the existing “improved data stream mining” (ISDM) algorithm for maximal frequent itemsets. Extensive experimental analyses show that the proposed CIP outperforms the popular ISDM in terms of accuracy and time complexity for high-speed data streams. This article also presents a case study in which the proposed approach is applied to an application called “web prefetching”.

Keywords: Data streams, frequent itemsets, pruning, frequent patterns, web prefetching